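The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones, on the assumption that it is sufficiently representative of the data that will be encountered at test time. This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model's performance at test time. Although poisoning has been acknowledged as a relevant threat in industry applications, and a variety of attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the field in the last 15 years. We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research on poisoning, and shed light on the current limitations and open research questions in this field.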
translated by 谷歌翻译
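Backdoor attacks inject poisoning samples during training, with the goal of forcing a machine learning model to output an attacker-chosen class when presented with a specific trigger at test time. Although backdoor attacks have been demonstrated in a variety of settings and against different models, the factors affecting their effectiveness are still not well understood. In this work, we provide a unifying framework to study the process of backdoor learning through the lens of incremental learning and influence functions. We show that the effectiveness of backdoor attacks depends on: (i) the complexity of the learning algorithm, controlled by its hyperparameters; (ii) the fraction of backdoor samples injected into the training set; and (iii) the size and visibility of the backdoor trigger. These factors affect how fast a model learns to correlate the presence of the backdoor trigger with the target class. Our analysis unveils the intriguing existence of a region in the hyperparameter space in which the accuracy on clean test samples remains high while backdoor attacks are ineffective, thereby suggesting novel criteria to improve existing defenses.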
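To make the poisoning step concrete, here is a minimal sketch of how a backdoor training set might be built: a small patch is stamped onto a chosen fraction of the training images, and those images are relabeled to the target class. The patch shape and position, the helper names (`add_trigger`, `poison_dataset`), and the toy data are assumptions for illustration only; the abstract does not specify any of them.

```python
import random


def add_trigger(image, patch_size=2, value=1.0):
    """Return a copy of a 2-D image (list of rows) with a square patch
    of constant `value` stamped in the bottom-right corner."""
    stamped = [row[:] for row in image]
    for r in range(len(stamped) - patch_size, len(stamped)):
        for c in range(len(stamped[r]) - patch_size, len(stamped[r])):
            stamped[r][c] = value
    return stamped


def poison_dataset(images, labels, target_class, fraction, patch_size=2, seed=0):
    """Copy the dataset, stamp the trigger on `fraction` of the samples,
    and relabel those samples to `target_class`."""
    rng = random.Random(seed)
    n_poison = round(fraction * len(images))
    poisoned_idx = sorted(rng.sample(range(len(images)), n_poison))
    new_images = [[row[:] for row in img] for img in images]
    new_labels = list(labels)
    for i in poisoned_idx:
        new_images[i] = add_trigger(images[i], patch_size)
        new_labels[i] = target_class
    return new_images, new_labels, poisoned_idx


# Hypothetical toy data: 50 images of 6x6 pixels with values below 0.5.
data_rng = random.Random(1)
images = [[[0.5 * data_rng.random() for _ in range(6)] for _ in range(6)]
          for _ in range(50)]
labels = [data_rng.randrange(10) for _ in range(50)]

x_p, y_p, idx = poison_dataset(images, labels, target_class=7, fraction=0.1)
```

The `fraction` and `patch_size` arguments correspond to factors (ii) and (iii) in the abstract; factor (i), the complexity of the learner, lives in whatever model is subsequently trained on `x_p` and `y_p`.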
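Clustering algorithms play a fundamental role as tools in decision-making and sensible automation processes. Because these applications are so widely used, analyzing the robustness of this family of algorithms against adversarial noise has become imperative. To the best of our knowledge, however, only a few works have addressed this problem so far. In an attempt to fill this gap, we propose a black-box adversarial attack for crafting adversarial samples to test the robustness of clustering algorithms. We formulate the problem as a constrained minimization program that is general in its structure and can be customized by the attacker according to her capability constraints. We do not assume any information about the internal structure of the victim clustering algorithm, and we allow the attacker to query it only as a service. In the absence of any derivative information, we perform the optimization with a custom approach inspired by the Abstract Genetic Algorithm (AGA). In the experimental part, we demonstrate the sensitivity of different single and ensemble clustering algorithms to our crafted adversarial samples in different scenarios. Furthermore, we compare our algorithm with a state-of-the-art approach, showing that we are able to match or even outperform its performance. Finally, to highlight the general nature of the generated noise, we show that our attacks are transferable even against supervised algorithms such as SVMs, random forests, and neural networks.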
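Modeling continuous dynamical systems from discretely sampled observations is a fundamental problem in data science. Often, such dynamics are the result of non-local processes that involve an integral over time. As such, these systems are modeled with Integro-Differential Equations (IDEs): generalizations of differential equations that comprise both an integral and a differential component. For example, brain dynamics are not accurately modeled by differential equations because their behavior is non-Markovian, i.e. the dynamics are in part dictated by history. Here, we introduce the Neural IDE (NIDE), a framework that models the ordinary and integral components of IDEs using neural networks. We test NIDE on several toy and brain activity datasets and demonstrate that NIDE outperforms other models, including Neural ODE. These tasks include time extrapolation as well as predicting dynamics from unseen initial conditions, which we test on whole-cortex activity recordings in freely behaving mice. Further, we show that NIDE can decompose dynamics into their Markovian and non-Markovian constituents via the learned integral operator, which we test on fMRI brain activity recordings acquired under ketamine. Finally, the integrand of the integral operator provides a latent space that gives insight into the underlying dynamics, which we demonstrate on wide-field brain imaging recordings. Altogether, NIDE is a novel approach that enables the modeling of complex non-local dynamics with neural networks.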
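Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with some benefit from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.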
Process monitoring and control are essential in modern industries for ensuring high quality standards and optimizing production performance. These technologies have a long history of application in production and have had numerous positive impacts, but they also hold great potential when integrated with Industry 4.0 and advanced machine learning solutions, particularly deep learning. However, implementing these solutions in production and enabling widespread adoption requires deep learning methods that scale and transfer, which has made scalability and transferability a focus of research. While transfer learning has proven successful in many cases, particularly with computer vision and homogeneous data inputs, it can be challenging to apply to heterogeneous data. Motivated by the need to transfer and standardize established processes to different, non-identical environments and by the challenge of adapting to heterogeneous data representations, this work introduces the Domain Adaptation Neural Network with Cyclic Supervision (DBACS) approach. DBACS addresses the issue of model generalization through domain adaptation, specifically for heterogeneous data, and enables the transfer and scalability of deep learning-based statistical control methods in a general manner. Additionally, the cyclic interactions between the different parts of the model enable DBACS not only to adapt to the domains, but also to match them. To the best of our knowledge, DBACS is the first deep learning approach to combine adaptation and matching for heterogeneous data settings. For comparison, this work also includes subspace alignment and a multi-view learning approach that deals with heterogeneous representations by mapping data into correlated latent feature spaces. Finally, DBACS, with its ability to adapt and match, is applied to a virtual metrology use case for an etching process run on different machine types in semiconductor manufacturing.
An Anomaly Detection (AD) system for self-diagnosis has been developed for a Multiphase Flow Meter (MPFM). The system relies on machine learning algorithms for time series forecasting: historical data are used to train a model that predicts the behavior of a sensor, and deviations from the predicted behavior are used to detect anomalies.
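The forecast-then-compare idea can be sketched as follows. The abstract does not say which forecasting model is used, so a one-step autoregressive model fitted by least squares stands in for it here. The function names, the synthetic sine-wave "sensor", and the threshold of four residual standard deviations are all assumptions made for illustration.

```python
import math


def fit_ar1(history):
    """Least-squares fit of x[t] ~ a * x[t-1] + b on historical readings.
    Returns (a, b, sigma), where sigma is the RMS one-step forecast error.
    An AR(1) model is an assumed stand-in for the unspecified forecaster."""
    xs, ys = history[:-1], history[1:]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    sigma = math.sqrt(sum(r * r for r in residuals) / n)
    return a, b, sigma


def detect_anomalies(model, readings, threshold=4.0):
    """Flag indices whose one-step forecast error exceeds threshold * sigma."""
    a, b, sigma = model
    flagged = []
    for t in range(1, len(readings)):
        forecast = a * readings[t - 1] + b
        if abs(readings[t] - forecast) > threshold * sigma:
            flagged.append(t)
    return flagged


# Hypothetical sensor: a slow sine wave. The model is trained on history,
# then applied to new readings containing one injected fault at index 30.
history = [math.sin(0.1 * t) for t in range(200)]
live = [math.sin(0.1 * t) for t in range(200, 260)]
live[30] += 1.0

model = fit_ar1(history)
flagged = detect_anomalies(model, live)
```

Because the faulty reading also corrupts the next forecast, a one-step scheme like this may flag the sample immediately after the fault as well.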
Self-supervised learning is a popular and powerful method for utilizing large amounts of unlabeled data, for which a wide variety of training objectives have been proposed in the literature. In this study, we perform a Bayesian analysis of state-of-the-art self-supervised learning objectives and propose a unified formulation based on likelihood learning. Our analysis suggests a simple method for integrating self-supervised learning with generative models, allowing for the joint training of these two seemingly distinct approaches. We refer to this combined framework as GEDI, which stands for GEnerative and DIscriminative training. Additionally, we demonstrate an instantiation of the GEDI framework by integrating an energy-based model with a cluster-based self-supervised learning model. Through experiments on synthetic and real-world data, including SVHN, CIFAR10, and CIFAR100, we show that GEDI outperforms existing self-supervised learning strategies in terms of clustering performance by a wide margin. We also demonstrate that GEDI can be integrated into a neural-symbolic framework to address tasks in the small data regime, where it can use logical constraints to further improve clustering and classification performance.
Building a quantum analog of classical deep neural networks represents a fundamental challenge in quantum computing. A key issue is how to address the inherent non-linearity of classical deep learning, a problem in the quantum domain due to the fact that the composition of an arbitrary number of quantum gates, consisting of a series of sequential unitary transformations, is intrinsically linear. This problem has been variously approached in the literature, principally via the introduction of measurements between layers of unitary transformations. In this paper, we introduce the Quantum Path Kernel, a formulation of quantum machine learning capable of replicating those aspects of deep machine learning typically associated with superior generalization performance in the classical domain, specifically, hierarchical feature learning. Our approach generalizes the notion of Quantum Neural Tangent Kernel, which has been used to study the dynamics of classical and quantum machine learning models. The Quantum Path Kernel exploits the parameter trajectory, i.e. the curve delineated by model parameters as they evolve during training, enabling the representation of differential layer-wise convergence behaviors, or the formation of hierarchical parametric dependencies, in terms of their manifestation in the gradient space of the predictor function. We evaluate our approach with respect to variants of the classification of Gaussian XOR mixtures - an artificial but emblematic problem that intrinsically requires multilevel learning in order to achieve optimal class separation.
Aliasing is a highly important concept in signal processing, as careful consideration of resolution changes is essential in ensuring transmission and processing quality of audio, image, and video. Despite this, until recently aliasing has received very little consideration in Deep Learning, with all common architectures carelessly sub-sampling without considering aliasing effects. In this work, we investigate the hypothesis that the existence of adversarial perturbations is due in part to aliasing in neural networks. Our ultimate goal is to increase robustness against adversarial attacks using explainable, non-trained, structural changes only, derived from aliasing first principles. Our contributions are the following. First, we establish a sufficient condition for no aliasing for general image transformations. Next, we study sources of aliasing in common neural network layers, and derive simple modifications from first principles to eliminate or reduce it. Lastly, our experimental results show a solid link between anti-aliasing and adversarial attacks. Simply reducing aliasing already results in more robust classifiers, and combining anti-aliasing with robust training outperforms robust training alone on $L_2$ attacks with no or minimal loss in performance on $L_{\infty}$ attacks.
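The effect of sub-sampling without regard to aliasing can be seen in a one-dimensional toy example. The sketch below uses the classic signal-processing remedy of low-pass filtering before sub-sampling; it is not claimed to be the specific layer modification derived in the paper, and the kernel, padding scheme, and function names are assumptions for illustration.

```python
def subsample(signal, stride=2):
    """Keep every `stride`-th sample, with no filtering."""
    return signal[::stride]


def blur(signal, kernel=(0.25, 0.5, 0.25)):
    """Low-pass filter the signal with a small kernel, replicating the
    first and last samples as padding so the length is preserved."""
    half = len(kernel) // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [sum(k * padded[i + j] for j, k in enumerate(kernel))
            for i in range(len(signal))]


def blur_then_subsample(signal, stride=2):
    """Anti-aliased down-sampling: filter first, then drop samples."""
    return subsample(blur(signal), stride)


# A signal alternating between +1 and -1 sits at the highest frequency
# that the original sampling rate can represent.
high_freq = [1.0 if i % 2 == 0 else -1.0 for i in range(16)]

naive = subsample(high_freq)               # every kept sample is +1.0
filtered = blur_then_subsample(high_freq)  # interior samples are 0.0
```

Naive stride-2 sub-sampling turns the oscillation into a constant signal, which is an alias of the original content. With the blur applied first, the oscillation is removed before samples are dropped, so only a boundary artifact from the padding remains in the first output sample.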